The will to end one’s life is commonly believed to stem from factors such as financial stability, health issues, and physical environment. Multiple publicly available data sets report the number of suicide cases for various countries around the globe, alongside factors such as GDP per capita and life expectancy. These observations give us an objective basis for exploring these intuitive relations.
The happiness index measures contentment among the residents of the world’s countries. Happiness scores are calculated from population surveys and are expected to be a function of multiple parameters such as average life expectancy, family income, and family size. Different agencies model this index differently, but it heuristically represents the general contentment with life that people feel in places around the world.
Since one intuitively expects a relationship between suicides and happiness, this report checks that intuition against objective metrics and real-world observations.
We primarily use three data sets for our analysis, namely:
Suicide Rates for Prominent Countries (1990-2015)
2024 Urban Bliss Index (2024-2027)
Note: This is a predictive data set that uses different parameters to predict the happiness values for various regions for the upcoming years. We use this data to understand how happiness scores are modeled and do not use this data in our analysis.
(Source: Kaggle Datasets)
For each continent, we analyse the relationship between GDP per capita and the suicide rates of its countries. We do this to test our intuition that financial stability in a region is directly correlated with the number of suicides in that region.
Result: We observe no clear correlation between GDP per capita and the number of suicides per capita.
Remark: Although financial stability at the individual level cannot be measured simply by the per-capita income of a region, per-capita income still gives a large-scale picture of the financial state of that region’s residents. With a large enough data set, we can therefore hope to gauge the general effect of income on suicide rates.
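The per-continent comparison described here can be sketched in a few lines of Python. This is a minimal illustration rather than the code behind the report's plots: the rows, the continent labels "A" and "B", and the helper names `pearson` and `correlation_by_group` are all made up for the example.

```python
from collections import defaultdict
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def correlation_by_group(rows):
    """Group (continent, gdp, rate) rows by continent and correlate gdp with rate."""
    groups = defaultdict(lambda: ([], []))
    for continent, gdp, rate in rows:
        groups[continent][0].append(gdp)
        groups[continent][1].append(rate)
    return {c: pearson(xs, ys) for c, (xs, ys) in groups.items()}

# Hypothetical rows: (continent, GDP per capita, suicides per 100k people).
# These numbers are invented for illustration and are NOT from the data sets.
rows = [
    ("A", 10000, 12.0), ("A", 20000, 8.0), ("A", 30000, 12.0),
    ("B", 5000, 4.0), ("B", 15000, 6.0), ("B", 25000, 8.0),
]

result = correlation_by_group(rows)
for continent, r in sorted(result.items()):
    print(continent, round(r, 3))
```

In this toy input, continent "A" has no linear trend while continent "B" is perfectly linear, so the two groups show the two extremes that a per-continent correlation can take.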
Next, we check whether the happiness index is connected to financial stability or shows the same lack of trend as suicide rates.
Comparing GDP per capita with the happiness index gives the following plot.
We observe that the happiness index is strongly correlated with GDP per capita, in contrast to the trend observed for suicide rates. We can further substantiate this result by repeating the comparison for individual continents:
To understand the impact of health conditions and life expectancy on suicide rates, we first establish that GDP per capita and life expectancy are highly correlated.
To further substantiate the strong correlation between Health Index and GDP per Capita, we can look at the correlation between the fitted values and the observed values.
The correlation between Life Expectancy and GDP per Capita is:
## [1] 0.7816253
As expected, the correlation is high (about 0.78).
This loosely suggests that financially well-off people tend to have fewer health issues and consequently a higher life expectancy. We do not, however, observe such a trend between suicide rates and per-capita income.
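The check on fitted versus observed values can be illustrated with a small least-squares sketch in Python. It assumes a simple one-predictor linear model with an intercept, which the report does not spell out; the data points and the function names `fit_line` and `fitted_vs_observed_correlation` are hypothetical. For such a model, the correlation between fitted and observed values equals the square root of R².

```python
from math import sqrt

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def fitted_vs_observed_correlation(xs, ys):
    """Correlation between fitted and observed values, computed as sqrt(R^2)."""
    intercept, slope = fit_line(xs, ys)
    fitted = [intercept + slope * x for x in xs]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return sqrt(1 - ss_res / ss_tot)

# Invented example values, NOT taken from the report's data:
# GDP per capita in units of 10,000 and life expectancy in years.
gdp = [1.0, 2.0, 3.0, 4.0]
life_expectancy = [60.0, 70.0, 68.0, 78.0]

intercept, slope = fit_line(gdp, life_expectancy)
r = fitted_vs_observed_correlation(gdp, life_expectancy)
print(intercept, slope, r)
```

With a single predictor, this value is the same as the absolute Pearson correlation between the predictor and the response, so a high value here and a high raw correlation are two views of the same fact.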
If we look at the distribution of “Happiness” across various continents, we come across the following plot:
This leads us to believe that several regional and environmental factors might also influence the happiness index and, correspondingly, the suicide rates of a region.
We form the following test to check our intuition, comparing the mean happiness index of the upper and lower hemispheres.
## [1] "Difference in Mean of happiness index in Upper Hemisphere to Lower Hemisphere : 0.402737286528037"
## [1] "Deviation in the mean difference : 0.0837894696952222"
## [1] "Hypothesis : Mean of Upper Hemisphere <= Mean of Bottom Hemisphere"
## [1] "P value : 7.67830773618501e-07"
Since the p-value is very low, the hypothesis is rejected. The mean happiness index of the upper hemisphere is therefore very likely higher than that of the lower hemisphere.
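The printed figures are consistent with a one-sided z-test: dividing the reported mean difference by the reported deviation gives a z-score, and the upper tail of the standard normal distribution gives the p-value. The Python sketch below reproduces that arithmetic. It assumes that the "deviation" is the standard error of the mean difference and that a normal approximation was used; the output does not state either explicitly.

```python
from statistics import NormalDist

# Values copied from the output above.
mean_difference = 0.402737286528037   # upper minus lower hemisphere
standard_error = 0.0837894696952222   # assumed meaning of the reported "deviation"

# Hypothesis under test: mean(upper) <= mean(lower).
# A large positive z-score is evidence against it.
z = mean_difference / standard_error

# One-sided p-value: probability of a standard normal value above z.
# cdf(-z) is used instead of 1 - cdf(z) to keep precision in the far tail.
p_value = NormalDist().cdf(-z)

print(f"z = {z:.4f}, p = {p_value:.3e}")  # p is close to the reported 7.68e-07
```

A z-score of roughly 4.8 means the observed difference is almost five standard errors away from zero, which is why the hypothesis is rejected so decisively.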
The question we want to explore in this section is: “Are people in certain age groups more likely to commit suicide?”
Looking at the mean values for the different age groups, we can derive the following conclusion:
From this data, we can conclude that the likelihood of suicide increases with age.
We can also analyse how prone each generation is to suicide.
We can look for a similar pattern in the happiness scores around the globe for the same generations:
Silent Generation – 1928-1945
Boomers – 1946-1964
Generation X – 1965-1980
Millennials – 1981-1996
Generation Z – 1997-2012
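The generation boundaries listed above translate directly into a lookup from birth year to generation. The following Python sketch shows one way to do that clustering; the names `GENERATIONS` and `generation_for` are illustrative and not taken from the report's code.

```python
# (name, first birth year, last birth year), as listed above.
GENERATIONS = [
    ("Silent Generation", 1928, 1945),
    ("Boomers", 1946, 1964),
    ("Generation X", 1965, 1980),
    ("Millennials", 1981, 1996),
    ("Generation Z", 1997, 2012),
]

def generation_for(birth_year):
    """Return the generation name for a birth year, or None if it is out of range."""
    for name, first, last in GENERATIONS:
        if first <= birth_year <= last:
            return name
    return None

print(generation_for(1950))  # Boomers
print(generation_for(1980))  # Generation X
print(generation_for(1900))  # None
```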
We can extrapolate the happiness index using the linear model derived in the earlier section to check whether a generation more likely to commit suicide was, on average, less happy. We use per-capita income to estimate life expectancy and then carry out the following analysis.
We observe that the hypothesis does not hold even under this form of clustering.
With the following plots we aim to understand the growth or dip in suicide rates for various continents over time. We do this to try and arrive at a possible correlation between suicide rates, time and region.
In Europe, Australia and Africa, the suicide rate decreases after the 2000s, while in North America and South America there is a slight increase. In Asia, it fluctuates considerably.
Overall, we are still unable to correlate suicide rates with any of these parameters.
With that motivation set, we can look at our final merged data-set that helps us perform cross analysis of all of our parameters.
We then plot the happiness scores vs. suicide rates to arrive at some possible correlation between the two quantities.
## [1] "Correlation - 0.0214553875000274"
As can be seen, there is very little correlation between the happiness index and the suicide rate, which is a surprising result. It goes against our original intuition that happy people don’t commit suicide and vice versa.
Having found that happiness and suicide rates are essentially uncorrelated in our data, we can attempt to predict happiness values based purely on GDP and life expectancy. There are obviously many more features, not present in this data, that might impact happiness. But if the predicted values are fairly close to the reported values, this lends confidence to the model and is consistent with happiness being largely independent of suicide rates.
## [1] "Mean Absolute Percentage error of the SVM model = 9.38326673577373"
## [1] "Mean Squared Error of the SVM model = 0.35975755938666"
## [1] "Mean Absolute Percentage error of the SVM model with PCA = 9.58733977594366"
## [1] "Mean Squared Error of the SVM model with PCA = 0.384836038351249"
Thus the SVM model (without PCA) gives a MAPE of about 9.4% (roughly 0.09 as a fraction) and an MSE of about 0.36. The model therefore predicts happiness scores reasonably well based purely on GDP and life expectancy, lending credibility to these two metrics as good indicators of quality of life and hence of the reported happiness of the people.
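For reference, the two error metrics can be computed as in the Python sketch below. It uses the standard definitions, with MAPE expressed in percent to match the scale of the output above. The sample values are invented for illustration and are not the model's predictions.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent. Actual values must be non-zero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)

def mse(actual, predicted):
    """Mean squared error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical happiness scores and predictions, NOT from the report.
actual = [5.0, 6.0, 4.0, 8.0]
predicted = [5.5, 5.4, 4.0, 7.2]

mape_value = mape(actual, predicted)
mse_value = mse(actual, predicted)
print(mape_value, mse_value)
```

MAPE is relative to each observed value, so it can be compared across scales, whereas MSE is in squared units of the happiness score and weights large errors more heavily.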
In the analysis above, we tried to correlate suicide rates with multiple intuitive parameters and failed to establish a clear relationship with any factor that felt logically linked to suicide rates.
We also found that the happiness score, a value derived from population surveys, has a very clear correlation with the same factors (namely GDP per capita, life expectancy, etc.). Hence, it became possible for us to quantify happiness in some sense of the word.
Finally, when we compared the trends in suicide rates and happiness, we found that suicides are more or less independent of how happy a certain group of people is.
Hence, the idea that “Happy people do not commit suicides and vice-versa” is not substantiated by real world data.